Application of machine learning for inter turn fault detection in pumping system

Pump fault diagnosis is essential for the maintenance and safety of the device, as pumps are important appliances used in many major sectors. Timely fault diagnosis can reduce maintenance costs and save energy. This article uses a Simulink model based on mathematical equations to analyze how the parameters of a three-phase induction motor-based centrifugal pump change under inter-turn fault conditions. The inter-turn fault causes a massive increase in current, which severely affects the parameters of both motor and pump. These effects have been analyzed by simulation with the Matlab Simulink model, and the results are then verified on a hardware-in-the-loop (HIL) simulator. Machine learning (ML) models based on an artificial neural network (ANN) and an adaptive neuro-fuzzy inference system (ANFIS, a combination of ANN and fuzzy logic) have been applied for fault detection. Both models provide a satisfactory level of accuracy in training and testing. The models are compared on root mean square error (RMSE), R2, prediction accuracy, and mean validation value to find out which is more suitable for this experiment. Finally, various supervised algorithms are compared with ANN and ANFIS to find the most suitable one for this experiment.

The induction motor is a commonly used device that is indispensable in various industries and has received increased attention for its robust construction, high performance, reliability, and low maintenance cost 1. Any kind of fault in an induction motor has drastic consequences for the devices connected to the motor and for the entire system. If a pump is connected to a faulty induction motor, the head value and the flow rate will change, and severe vibration will cause serious damage 2. Breakdown of the entire system causes damage and enormous energy loss, and sudden unplanned downtime causes huge maintenance costs. It is reported that 30-40% of induction motor failures are due to stator inter-turn faults 3. This electrical fault is very sensitive and causes severe damage. An inter-turn fault of only 10-20% causes a massive rise of current in the induction motor, which causes insulation losses in the windings 4. Electrical faults are categorized as stator faults and rotor faults 5. The rotor fault seen in the induction motor is a broken rotor bar. There are three main stator faults: phase-to-phase fault, inter-turn fault, and phase-to-ground fault. Among those, the inter-turn fault is significant and critical 6. The inter-turn fault hampers both induction motor operation and pumping operation. Besides mechanical and hydraulic faults, electrical faults also hamper pump performance. A centrifugal pump is a rotating machine used to transfer fluid through pipes. Sudden shutdown of the pumping system causes a huge loss in maintenance 7. It has been found that the pumping system accounts for 70% of the maintenance cost, so maintenance technology must be improved to reduce cost. Various studies have addressed inter-turn fault detection in induction motors. In one of them, the voltages between the lines, the neutral, and the star point of the motor were used for fault detection.
These voltages were used as a model of the motor, and an imbalance was created by an inter-turn short circuit fault. This imbalance should be identified before total breakdown and significant damage to the devices 8. In other research, the negative sequence impedance was estimated and used as a fault indicator; the negative sequence impedance appears because of an imbalance in the motor. Oscillation in the Park-transformed current, which is created by the imbalance, was also used for fault detection.
Space vector analysis is required to identify this problem 9. Electrical faults can be detected by motor current signature analysis (MCSA) and vibration analysis. To estimate the negative sequence impedance in the motor, robustness with respect to an unbalanced voltage supply was added in one approach 10. Frequency spectrum and fast Fourier transformation (FFT) analysis are also helpful for induction motor fault detection. Wavelet packet transformation (WPT) and FFT were used along with a classifier in some work 11,12.

Proposed model
The proposed research is based on mathematical equations of a pumping system coupled with an induction motor. These equations are used to build the Matlab Simulink model of an induction motor-based pumping system. Because creating faults in a real-time system causes significant damage, a Simulink model based on the mathematical equations has been developed to analyze healthy and faulty situations. At first, the model was analyzed in healthy conditions without changing any parameter value. Then, to analyze the faulty situation, the phase A winding was shorted so that the current in phase A suddenly increases, which also increases the phase B and phase C currents. Due to the change in current, the torque and speed also change, and pump parameters such as pressure and flow rate are forced to change because the pump is coupled with the induction motor and runs at the same speed.

Mathematical model of induction motor-based pumping system
This section describes various equations that express voltage, current, and flux of stator and rotor of induction motor.
$$V_s = R_s I_s + P\lambda_s \qquad (1)$$

where V represents voltage, I represents current, and λ represents flux linkage. The subscripts "s" and "r" denote stator and rotor, respectively, and a, b, c denote the three phases. The subscripts as1 and as2 denote the unfaulty and faulty parts of the stator winding, respectively. P is the operator that replaces the derivative d/dt.

The voltage across the shorted part of the stator winding is

$$V_{as2} = \beta R_s (I_{as} - I_f) + P\lambda_{as2} = R_f I_f \qquad (7)$$

and the stator flux linkage is

$$\lambda_s = L_s I_s + L_{sr} I_r \qquad (8)$$

The resistance matrix and the equations for the mutual inductance and self-inductance of the stator winding follow refs. 12-14. Here β denotes the shorted turns in phase a, θ_r is the rotor position, L_s is the stator self-inductance, L_r is the rotor self-inductance, and L_sr is the stator-to-rotor mutual inductance.

Water is pumped from a constant-level water tank, and the pumping system consists of the water tank, an asynchronous three-phase induction motor, and other parts. The tank receives liquid with input flow q_v1, and the output flow of the control valve is q_v2. The plant dynamics have been analyzed with the help of fluid mechanics and fundamental physical laws, and a mathematical model has been developed 21. This model includes the mathematical models of the centrifugal pump and the tank. The rotational counterpart of Newton's law of force is that angular acceleration is proportional to the torque on the axis, which gives the equation of motion for the motor and pump set.
Here J is the moment of inertia, which is the constant of proportionality in this case. The active torque of the asynchronous motor is M_MT, the acceleration torque is M_a, the passive or resistive torque of the pump is M_p, and the viscous torque is M_ζ 22. The network frequency is f, and the stator pole-pair number is assumed to be one. The torque of the asynchronous motor, the viscous torque, and the passive torque are each expressed by their own equations.
Equation 18 gives the basic parameters of the centrifugal pump: Q is the pump flow rate, H is the pump head, and ω is the angular velocity. The pump flow is expressed by the peripheral cross-section of the impeller channels and the meridian component of velocity. The flow rate is proportional to the angular velocity, and the head is proportional to its square 23.
In the last equation, the pump efficiency coefficient is treated as constant, although it changes to some extent in different modes, which is reflected in the other parameters. The total head of the operating system, H_Total, is defined in terms of the static head H_S, the dynamic head H_D, the pressure on the surface of the water in the receiving tank P_RT, and the pressure on the surface of the water in the reservoir tank P_RES 24.
Atmospheric pressure changes with height, so the pressure differs with the elevation difference between the reservoir and the receiving tank. This change is not significant and is considered negligible.
The total head therefore reduces to the sum of the static head and the dynamic head. The static head H_S is the elevation difference between the surface of the reservoir and the point of discharge into the receiving tank. The system's static head varies between maximum and minimum head values because the reservoir's water level also varies.
Here TWL is the top water level and BWL is the bottom water level. The dynamic head is generated by friction within the system and is calculated with the basic Darcy-Weisbach equation from the loss coefficient K, the velocity in the pipe, and the acceleration due to gravity g.
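The head relations described above can be written compactly. The following is a sketch in standard textbook form rather than the authors' exact expressions; the symbols ρ (water density) and v (velocity in the pipe) are assumed notation, while the other symbols are those defined in the text.

```latex
\begin{align}
H_{Total} &= H_S + H_D + \frac{P_{RT} - P_{RES}}{\rho g}
  && \text{(total head with surface pressures)} \\
H_{Total} &\approx H_S + H_D
  && \text{(pressure difference neglected)} \\
H_D &= \frac{K v^{2}}{2 g}
  && \text{(Darcy-Weisbach dynamic head)}
\end{align}
```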

The velocity is the flow rate Q through the pipe divided by the cross-sectional area A of the pipe. The loss coefficient K is made up of two elements: K_fittings, associated with the fittings used in the pipework that carries the water from the reservoir to the receiving tank, and K_pipe, associated with the length, friction, and diameter of the pipe.
Here f is the friction factor, L is the pipe length, and D is the pipe diameter. The friction factor f can be found from a modified version of the Colebrook-White equation.
Here k is the roughness factor and Re is the Reynolds number. The roughness factor k is a standard fixed value taken from standard tables and depends on the pipe material and the pipe condition. For any flow in the pipe, the Reynolds number is calculated from the velocity, the pipe diameter, and the kinematic viscosity ϑ 25. The operation of the pumping system is based on the affinity laws. According to the first affinity law, the flow Q is proportional to the shaft speed N. According to the second affinity law, the head is proportional to the square of the shaft speed.
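The dynamic-head chain described in this section (velocity, loss coefficient, friction factor, Reynolds number) can be sketched numerically as follows. This is a minimal illustration under stated assumptions, not the authors' Simulink implementation: the Swamee-Jain explicit formula stands in for the unspecified "modified" Colebrook-White equation, the flow is assumed turbulent, and all function names and numerical inputs are hypothetical.

```python
import math

G = 9.81  # acceleration due to gravity, m/s^2


def pipe_area(diameter):
    """Cross-sectional area of a circular pipe, A = pi * D^2 / 4."""
    return math.pi * diameter ** 2 / 4.0


def velocity(flow_rate, diameter):
    """Mean velocity in the pipe, v = Q / A."""
    return flow_rate / pipe_area(diameter)


def reynolds(v, diameter, kinematic_viscosity):
    """Reynolds number, Re = v * D / nu."""
    return v * diameter / kinematic_viscosity


def friction_factor(re, roughness, diameter):
    """Swamee-Jain approximation of Colebrook-White (assumed form, turbulent flow)."""
    term = roughness / (3.7 * diameter) + 5.74 / re ** 0.9
    return 0.25 / math.log10(term) ** 2


def dynamic_head(flow_rate, diameter, length, roughness, nu, k_fittings):
    """Dynamic head H_D = K * v^2 / (2 g), with K = K_fittings + K_pipe."""
    v = velocity(flow_rate, diameter)
    f = friction_factor(reynolds(v, diameter, nu), roughness, diameter)
    k_pipe = f * length / diameter  # pipe contribution to the loss coefficient
    return (k_fittings + k_pipe) * v ** 2 / (2.0 * G)


# Hypothetical example: 1 L/s through 20 m of 50 mm pipe carrying water.
h_d = dynamic_head(flow_rate=0.001, diameter=0.05, length=20.0,
                   roughness=5e-5, nu=1e-6, k_fittings=2.0)
```

Because the friction factor depends on the Reynolds number, the dynamic head grows slightly more slowly than the square of the flow rate in this sketch.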
The power of the pump can be calculated from the head, the flow rate, the acceleration due to gravity, and the density of water. Here P is the power required by the pump, H is the head, and g is the acceleration due to gravity.
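The affinity laws and the pump power can be illustrated with a short sketch. It assumes the ideal hydraulic power P = ρ g Q H without an efficiency term, which may differ from the authors' expression, and the flow value in the example is hypothetical because the text gives only the rated speed (2800 RPM) and head (23.5 m).

```python
RHO_WATER = 1000.0  # density of water in kg/m^3 (assumed value)
G = 9.81            # acceleration due to gravity in m/s^2


def scale_by_affinity(flow, head, speed_ref, speed_new):
    """Scale flow and head from speed_ref to speed_new.

    First affinity law: Q is proportional to N.
    Second affinity law: H is proportional to N squared.
    """
    ratio = speed_new / speed_ref
    return flow * ratio, head * ratio ** 2


def hydraulic_power(flow, head, rho=RHO_WATER, g=G):
    """Ideal hydraulic power in watts, P = rho * g * Q * H (no efficiency term)."""
    return rho * g * flow * head


# Rated point: head and speed from the text, flow of 2 L/s is hypothetical.
q_rated, h_rated, n_rated = 0.002, 23.5, 2800.0
q_half, h_half = scale_by_affinity(q_rated, h_rated, n_rated, n_rated / 2.0)
p_rated = hydraulic_power(q_rated, h_rated)
p_half = hydraulic_power(q_half, h_half)
```

Halving the speed halves the flow and quarters the head, so the ideal hydraulic power falls to one eighth in this sketch.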

Computer simulation
The experiment was done with a three-phase, 50 Hz, 415 V, 0.75 HP squirrel cage induction motor coupled with a VFD-based centrifugal pump with a speed of 2800 RPM and a head of 23.5 m. Under healthy conditions, the three-phase induction motor is symmetrical and produces only positive sequence currents. When the symmetry is disturbed during a fault, positive, negative, and zero sequence components are generated. The experiment was done by creating an inter-turn fault in the induction motor and analyzing the parameter changes of both the motor and the coupled pump. A Simulink model of a three-phase induction motor with a turn fault in one phase winding has been built in MATLAB. The Simulink model was developed because it is experimentally challenging to create a fault with a high percentage of shorted turns. The completed model is verified in both healthy and faulty conditions. The model is simulated at different levels of shorting in one phase winding, and the phase current values are stored in the MATLAB workspace. The negative, positive, and zero sequence currents are calculated from these values. The next step is to verify how the inter-turn fault affects the parameters of the pump coupled with the induction motor. After the simulation, the results are verified on the OP5700 real-time simulator (hardware in the loop) for validation. In another part of the experiment, ML algorithms have been applied to the simulation data collected through MATLAB to identify and predict faults in induction motor-based pumping systems and to analyze which algorithm is suitable for detecting the fault. The Simulink model was built from the mathematical equations in "Mathematical model of induction motor-based pumping system". Figure 1 shows the block diagram of inter-turn fault detection in an induction motor-based pumping system.
The details of the induction motor are: stator resistance R_s is 0.288 Ω, rotor resistance R_r is 0.158 Ω, stator inductance L_s and rotor inductance L_r are 0.0425 H and 0.0438 H, respectively, mutual inductance L_m is 0.0412 H, and inertia J is 0.4. The number of poles is 2.
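The sequence currents mentioned above can be obtained from the three phase-current phasors with the standard symmetrical-component (Fortescue) transformation. The sketch below is an illustration in Python rather than the authors' MATLAB workflow; the function names and the example current values are hypothetical.

```python
import cmath
import math

# 120-degree rotation operator used by the symmetrical-component transformation.
A = cmath.exp(2j * math.pi / 3)


def phasor(magnitude, angle_deg):
    """Build a complex phasor from a magnitude and an angle in degrees."""
    return cmath.rect(magnitude, math.radians(angle_deg))


def sequence_components(ia, ib, ic):
    """Return the (zero, positive, negative) sequence phasors referred to phase a."""
    i0 = (ia + ib + ic) / 3
    i1 = (ia + A * ib + A ** 2 * ic) / 3
    i2 = (ia + A ** 2 * ib + A * ic) / 3
    return i0, i1, i2


# Healthy case: balanced currents give only a positive sequence component.
healthy = sequence_components(phasor(1.0, 0), phasor(1.0, -120), phasor(1.0, 120))

# Hypothetical faulty case: phase A current raised, as with a shorted winding.
faulty = sequence_components(phasor(1.5, 0), phasor(1.0, -120), phasor(1.0, 120))
negative_to_positive = abs(faulty[2]) / abs(faulty[1])
```

The ratio of negative to positive sequence magnitude rises from zero as the imbalance grows, which is why the per-unit sequence currents are a natural input for grading fault severity.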
Figure 1. Flow chart of inter turn fault detection in induction motor-based pumping system. The steps are: the main input parameter is the per-unit change of positive and negative sequence current, used to classify the severity of the fault level in the phase windings; feature extraction; application of ML algorithms for training and testing the model; selection of the best-suited algorithm for inter-turn fault detection in the pumping system; and verification of the data and simulation results on the HIL device (OP5700).

There is no short-circuited turn when the system is in healthy condition. When the system is in a fault condition, the negative sequence current increases as the turn fault percentage increases. In the proposed research, up to 40% inter-turn fault has been measured. The value of δ varies from 1 to 0.98 for short-circuit levels of 0 to 40%. Figure 2 shows the Simulink model of the induction motor-based centrifugal pump system, which has a three-phase source, a VFD drive, and an induction motor coupled with a pump. Table 1 shows the magnitudes of the phase currents and the sequence component currents. Although the simulation model has been analyzed from 0 to 40% short circuit fault, the HIL OP5700 results have been compared only between the healthy condition and the 40% short circuit fault, to check the largest fluctuation under the extreme fault condition. When the inter-turn fault occurs, the phase A current increases, which also increases the phase B and phase C currents. During fault conditions, the torque response of the motor suffers from large oscillations. As the percentage of shorted turns increases, the torque increases and the speed decreases. The motor is coupled with the pump, so its speed is also fed to the pump; once the pump operates in a fault condition, the flow rate suddenly increases and the pressure decreases. If the pressure falls below the vapour pressure, cavitation occurs, and the sudden increase in flow rate creates a vibration problem in the overall system.

Results and discussion
The figures show the performance curves of current, speed, and torque with respect to time, and the pump curves of flow rate versus head, in both healthy and faulty conditions. All the results were obtained from the OP5700 HIL device, which verified the simulation results. Figure 3a,b show the stator current in the healthy and 40% inter-turn fault conditions. The phase A, B, and C currents increase when the fault occurs. Figure 4a,b show the speed in the healthy and 40% inter-turn fault conditions. Similarly, Fig. 5a,b show the torque in the healthy and 40% inter-turn fault conditions. During a fault condition, the motor suffers from oscillations. The size of the oscillations changes as the percentage of shorted turns increases at the same load condition. As the machine's rated power increases, the oscillation in the torque also increases. Figure 3 shows that the current increases for phase A. Once an inter-turn fault occurs, the current increases with the number of shorted turns, and the increase in the phase B and phase C currents is also accelerated. Similarly, the speed and torque values also increase during fault conditions, as shown in Figs. 4 and 5. Once the motor speed increases, the pump speed also increases. The increase in speed causes an increase in flow rate and a decrease in head value. Figure 6a,b show the pump performance curve and system curve in the healthy and 40% inter-turn fault conditions. Figure 7 shows the hardware setup of the HIL device.

Application of ML approach in the induction motor-based pump fault detection
Generally, ML algorithms are of two types, supervised and unsupervised. Supervised algorithms have target variables that are predicted from the input variables. Figure 8 shows the generalized block diagram of the proposed research after data collection, used to find the best-suited algorithm.
ANN is a powerful technique for diagnosing induction motor faults accurately. A neural network (NN) is a pattern classifier, and many problems that involve variable recognition can be solved by NN pattern classification. In induction motor fault diagnosis, the fault behaviour cannot be entirely described or predicted. ANN is a computational algorithm based on a mathematical model that behaves like the human brain and its thinking process. Its features include parallel processing, self-organization, self-learning, classification, and non-linear mapping. ANFIS is the combination of fuzzy logic and ANN 26,27; the two are combined to improve speed, fault tolerance, and adaptiveness and to obtain a better modeling system. The RMSE and R2 values are used to compare which algorithm is more suitable for inter-turn fault detection in an induction motor-based pumping system. ANN and ANFIS models are proposed for inter-turn fault detection in the induction motor. An artificial immune system for ANN has self-adaptive control and performs better for continuous nonlinear functions. The process can be done through online monitoring 28. ANN is highly interconnected, similar to the human brain, and follows a learning process like that of human beings 29. The units are interconnected, and each connection has a weight that multiplies the value passing through it. Each unit also has a fixed input known as the bias; the unit forms a weighted sum to which the bias is added, and a transfer function is applied to this sum. The prediction of the NN depends on the training and testing data. The main task before applying the ML algorithm is feature extraction, an important tool that helps to classify the training and testing data for analysis.
The most widely used features are root mean square (RMS), kurtosis value (KV), root amplitude, peak-to-peak value (PPV), standard deviation (SD), skewness value (SV), clearance factor, crest factor (CF), impulse factor (IF), shape factor (SF), and mean value (MV). These statistical features help in analyzing each signal in healthy and faulty conditions. Feature extraction reduces the large amount of information contained in the current signal to a few statistical values that reflect the overall signal. The raw current signal is converted into multiple features that support an intelligent system in analyzing and classifying healthy and faulty situations; this overall procedure is called feature extraction. The statistical features used in the proposed research and their equations are given in Table 2, where x is the signal and N is the number of samples. The most commonly used neural networks are multilayer feedforward networks, trained for example with the Levenberg-Marquardt method. In the proposed research the Levenberg-Marquardt method has been used.
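The statistical features listed above can be computed for one signal segment as sketched below. The definitions follow common condition-monitoring conventions and are assumptions here, since Table 2 is not reproduced in full; in particular the standard deviation uses the population form (division by N), and the segment is assumed to be neither constant nor all zero. The function name is hypothetical.

```python
import math


def statistical_features(x):
    """Compute time-domain features of one signal segment x (a list of floats)."""
    n = len(x)
    mean = sum(x) / n
    abs_mean = sum(abs(v) for v in x) / n
    rms = math.sqrt(sum(v * v for v in x) / n)
    sd = math.sqrt(sum((v - mean) ** 2 for v in x) / n)  # population form
    peak = max(abs(v) for v in x)
    root_amp = (sum(math.sqrt(abs(v)) for v in x) / n) ** 2
    return {
        "mean": mean,
        "rms": rms,
        "sd": sd,
        "peak_to_peak": max(x) - min(x),
        "root_amplitude": root_amp,
        "skewness": sum((v - mean) ** 3 for v in x) / (n * sd ** 3),
        "kurtosis": sum((v - mean) ** 4 for v in x) / (n * sd ** 4),
        "crest_factor": peak / rms,            # X_peak / X_rms
        "shape_factor": rms / abs_mean,
        "impulse_factor": peak / abs_mean,
        "clearance_factor": peak / root_amp,
    }


# Hypothetical segment: two cycles of a unit sine sampled at 100 points.
segment = [math.sin(2 * math.pi * 2 * k / 100) for k in range(100)]
features = statistical_features(segment)
```

Each segment of each current signal yields one such feature set, and the chosen subset of features forms one row of the classifier input.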
The success of training is greatly affected by the proper selection of inputs 30. The learning process uses the training data, from which the NN constructs an input-output mapping. The weights and biases are adjusted iteratively by minimizing an error measure between the output produced and the desired output. This process is repeated until an acceptable convergence criterion is met. The NN consists of the input, hidden, and output layers, as shown in Fig. 9. The output layer consists of six neurons: healthy condition, 5 turns short circuit, 10 turns short circuit, 20 turns short circuit, 30 turns short circuit, and 40 turns short circuit. The number of hidden layers is chosen by a trial and error process.
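The unit computation and the six-neuron output layer described above can be sketched as a plain forward pass. This is an illustration only: the tanh transfer function, the softmax output, and the class label strings are assumptions, and no training (Levenberg-Marquardt or otherwise) is shown.

```python
import math

# Labels for the six output neurons (names chosen here for illustration).
CLASSES = ["healthy", "5 turns", "10 turns", "20 turns", "30 turns", "40 turns"]


def unit_output(inputs, weights, bias, transfer=math.tanh):
    """One unit: weighted sum of the inputs plus bias, passed through a transfer function."""
    weighted_sum = sum(w * x for w, x in zip(weights, inputs)) + bias
    return transfer(weighted_sum)


def layer_output(inputs, weight_rows, biases, transfer=math.tanh):
    """One layer: apply unit_output once per row of weights."""
    return [unit_output(inputs, w, b, transfer) for w, b in zip(weight_rows, biases)]


def softmax(values):
    """Turn the output-layer sums into class scores that add up to one."""
    shifted = [math.exp(v - max(values)) for v in values]
    total = sum(shifted)
    return [s / total for s in shifted]


def classify(features, hidden_w, hidden_b, out_w, out_b):
    """Forward pass through one hidden layer and the six-neuron output layer."""
    hidden = layer_output(features, hidden_w, hidden_b)
    scores = softmax(layer_output(hidden, out_w, out_b, transfer=lambda s: s))
    return CLASSES[scores.index(max(scores))], scores
```

In a trained network the weights and biases would come from the training procedure; here they are simply arguments of the hypothetical `classify` function.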
Because the current is the main parameter affected by an inter-turn fault, the stator currents are collected in both healthy and faulty conditions, the latter with different numbers of shorted turns. The currents are then converted to the qd frame. The current signals are preprocessed by feature extraction, and these features are fed to the classifiers for the diagnosis of induction motor faults. In the first case, the experiment was done in healthy conditions.
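The conversion of the three phase currents to the qd frame can be sketched with the stationary-frame (Clarke-type) transformation. The amplitude-invariant scaling and the sign convention for the d axis used below are assumptions, since the text does not state which convention the authors used; the function names are hypothetical.

```python
import math

SQRT3 = math.sqrt(3.0)


def abc_to_qd(ia, ib, ic):
    """Stationary-frame qd currents from instantaneous phase currents.

    Amplitude-invariant form with the q axis aligned to phase a (assumed convention).
    """
    iq = (2.0 * ia - ib - ic) / 3.0
    id_ = (ic - ib) / SQRT3
    return iq, id_


def balanced_currents(amplitude, angle):
    """Instantaneous balanced three-phase currents at electrical angle `angle` (rad)."""
    shift = 2.0 * math.pi / 3.0
    return (amplitude * math.cos(angle),
            amplitude * math.cos(angle - shift),
            amplitude * math.cos(angle + shift))


def qd_magnitude(ia, ib, ic):
    """Length of the qd current vector."""
    iq, id_ = abc_to_qd(ia, ib, ic)
    return math.hypot(iq, id_)


# For balanced currents the qd vector has constant length over the cycle.
magnitudes = [qd_magnitude(*balanced_currents(1.0, 2 * math.pi * k / 12))
              for k in range(12)]
```

With balanced currents the qd vector traces a circle; an imbalance such as an inter-turn fault makes its length oscillate over the cycle, which is the kind of signature the extracted features are meant to capture.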

Classification process for finding the suitable ML algorithm. Step 1: data collection. Step 2: data preprocessing. Step 3: condition classification and diagnosis of the pumping system.
Table 2 (excerpt). Crest factor (CF): $X_{crest} = X_{peak} / X_{rms}$.

Current data were collected first in the healthy condition and then in the 5, 10, 20, 30 and 40 shorted-turn fault conditions. The three-phase currents are converted to the qd frame through the Clarke transformation. For each signal, 5000 samples were collected over a duration of 0.2 s. Each signal was divided into 50 segments of 1000 samples. Feature extraction is needed to process the raw data signals: six features were extracted from each segment and, as there were two signals, a dataset with 12 feature dimensions was formed. The total dimension of the dataset is 12 × 300. These features were used as the input features of the neural network. The dataset was then split into a training set (70%), a cross-validation set (15%), and a testing set (15%), as shown in Table 3. The training set was used to train the model, and the cross-validation and testing sets were used to evaluate the performance of the classifier and determine the accuracy of the model. The mean square error (MSE) was calculated by the network to adjust the weights and find the final accuracy rate on the training and testing sets. Levenberg-Marquardt backpropagation was chosen for training, and the training and testing data give the average MSE for the ANN. The average MSE values with respect to the number of processing elements in the hidden layer are shown in Tables 4 and 5.

ANFIS is an intelligent neuro-fuzzy technique used to model and control ill-defined and uncertain systems. An ANFIS is built from input/output data pairs of the system under consideration. It combines ANN and fuzzy logic, which gives the fuzzy system the ability to learn. ANFIS consists of five layers 31,32. Layer 1 is the fuzzification layer, which calculates the membership function.
Layer 2 is the rules layer, whose output is the firing strength of each node. Layer 3 is the normalization layer, which normalizes the calculated firing strengths. Layer 4 is the consequent layer, whose output is the product of the normalized firing strength and the consequent polynomial of the fuzzy rule. Layer 5 is the defuzzification layer, whose output is the overall ANFIS output. ANFIS has been used to solve the problems of continuous change in mobile learning environments 33, where the proposed ANFIS model can be used for modeling the learner context. The steps for applying ANFIS to the learner model are defining the input and output values, defining fuzzy sets for the input values, defining the fuzzy rules, and creating and training the NN 34. Here also the stator currents are collected and transformed to the qd frame through the Clarke transformation. The ANFIS model uses the discrete wavelet transformation (DWT) or the continuous wavelet transformation (CWT). The CWT is similar in concept to the FFT, but it uses a family of wavelets as basis functions instead of sine and cosine functions. The wavelet has two parameters, scale and translation, so the signal is represented in a two-dimensional time-scale plane instead of a one-dimensional one. Equation (35) shows the CWT function.
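For reference, the standard textbook definition of the CWT is given below. It is assumed, not confirmed, to match the authors' Eq. (35); the symbols are scale a, time shift b, wavelet function ϕ, and signal x(t), with the asterisk denoting complex conjugation.

```latex
\begin{equation}
W_x(a, b) = \frac{1}{\sqrt{|a|}} \int_{-\infty}^{\infty} x(t)\,
            \phi^{*}\!\left(\frac{t - b}{a}\right) \mathrm{d}t
\end{equation}
```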
Here W_x is the wavelet transform with two parameters: a is the scale parameter and b is the time parameter. ϕ is the wavelet function, and x(t) is the original signal. DWT is used to detect rotor faults, and in the proposed research it is mainly used for feature extraction. The ANN and ANFIS models are implemented and compared in the proposed work to find the better performer. R2 and root mean square error (RMSE) are used to find the best-suited model for fault detection in induction motor-based pumping systems; for ANFIS in particular, RMSE and R2 are the main measures used to analyze the faults. The performance evaluation is based on DWT (Table 6). As for ANN, 5000 data samples have been collected for ANFIS and divided into 100 sections of 500 samples each. These 100 sections are used for feature extraction, giving 100 samples for each condition, healthy or turn fault. In total, 600 samples have been formed for the six conditions. Of these, 300 samples have been used for training and 300 for testing (Table 7). Table 8 shows the RMSE values for training and testing data in the different conditions. After the models are built, the performances of the ANN and ANFIS models are compared. The comparative RMSE and R2 values for the ANN and ANFIS models are given in Table 9. The validation data of the model is 0.05. Prediction accuracy is also measured by R2 and RMSE. The prediction accuracy of ANN (R2 is 100 and RMSE is 0.054) is better than that of the ANFIS model (R2 is 96.91 and RMSE is 0.121). This is the average comparison over all conditions. Both the ANN and ANFIS models performed well, are suitable for fault detection, and are able to predict the fault. However, based on the RMSE and R2 of the training and testing data, ANN performed better than ANFIS in the proposed experiment.
The ANN model has been trained for up to 200 epochs, and the best validation was reached at 150 epochs. Figure 12 shows the best-fit value of the proposed ANN model with respect to training, testing, validation, and overall values.
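The two metrics used above to compare the ANN and ANFIS models can be computed as sketched below. These are the standard definitions of RMSE and the coefficient of determination; the function names and the example values are hypothetical and are not the authors' data.

```python
import math


def rmse(y_true, y_pred):
    """Root mean square error between targets and predictions."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def r_squared(y_true, y_pred):
    """Coefficient of determination, R^2 = 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical targets and model outputs for three samples.
targets = [1.0, 2.0, 3.0]
outputs = [1.0, 2.0, 4.0]
error = rmse(targets, outputs)
fit = r_squared(targets, outputs)
```

A lower RMSE and an R2 closer to one (or 100 when expressed as a percentage, as in the text) indicate the better model.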

Comparison of ANN and ANFIS with existing studies
Various previous works have addressed inter-turn fault detection in induction motors and motor-based pumping systems. Until now, research on the inter-turn fault of induction motors has analyzed only the changes in the induction motor parameters after the fault occurs. But when the motor faces an inter-turn fault, the coupled pump is also affected, and the parameters of the pump also change. In the proposed research, the changes in pump parameters have been analyzed in addition to the changes in the fault-affected motor parameters. A current coordinate transform algorithm was developed for inter-turn fault analysis in induction motors, and Mexbios development studio was used to analyze the parameter changes in the induction motor during a fault condition. Although this can be implemented in industrial applications, the process cannot predict the fault before it occurs and major damage is seen 35. The proposed model is a simple process, works with a small amount of data, and predicts the fault before massive failure. An ANN algorithm was used for inter-turn fault detection in an induction motor with respect to various numbers of turns. That experiment created up to 10% inter-turn fault and obtained the phase A current changes. It analyzed the unit change of positive sequence current from negative sequence current and ran up to 54 epochs; the experimental analysis covered only a small range of values 36. An NN model was developed for inter-turn fault analysis under three conditions (no load, 50% load, and full load) for five different motors. Up to 15% inter-turn fault was created, and the accuracy rate of the NN model for the various motors varied from 88 to 99% 37. A novel wavelet analysis was developed in another study.
The model was built to analyze inter-turn faults based on discrete wavelet transformation using Park's vector transformation. Performance was analyzed for healthy and various turn fault conditions, and the MSE was obtained to assess accuracy in healthy and faulty situations 38. Other studies are based on FFT analysis and Park transformation, but they are not suitable for predictive control models and not useful for heavy industrial applications 39. The current research uses the proposed ANN method with a Matlab model developed to analyze the parameter changes of the induction motor and pump during an inter-turn fault. Because it is a Simulink model, a high range of inter-turn fault, up to 40%, can be created for analysis.
Here ANN and ANFIS models have been applied to the experimental results, and the performance of ANN is found to be better than that of the ANFIS model.
Similarly, the authors have also compared various supervised ML algorithms, namely SVM, K-NN, decision tree, Naïve Bayes, and regression analysis, with ANN and ANFIS. The algorithms are compared on accuracy rate, prediction speed, and training time to find the most suitable one for this experiment. Generally, supervised ML algorithms have target variables that are predicted from independent variables, and these variables generate functions that map the input to the desired output. The training process continues until the desired accuracy rate is obtained. No target variable is required for unsupervised learning, as it follows a clustering process. SVM is a well-known pattern recognition algorithm mainly used for classification and regression.

Applications and comparisons of various ML algorithms with ANN and ANFIS
SVM separates the dataset with a hyperplane and a margin to perform the classification task. The optimum hyperplane in the SVM maximizes the width of the margin to avoid overlapping of the classes; this is the classification process. Margins are classified as hard margins and soft margins. Since the present diagnosis deals with a non-linear classification problem, a soft margin is used. SVM accuracy depends on three factors: threshold function, cost function, and kernel function. K-NN is a versatile non-parametric learning algorithm that is also used for classification and regression problems. Instead of learning a discriminative function, the algorithm memorizes the training dataset. By minimizing the training set, instance-based learning helps to avoid error. The disadvantages of K-NN are its large memory requirement, long prediction time, and sensitivity to irrelevant features. But when the data size is limited, K-NN works better than any other supervised learning algorithm. A decision tree is a dendritic classification model used for both classification and regression problems. Classification is done by breaking the data into smaller subsets, and feature selection can be based on this. The final structure resembles tree branches, and each node represents a feature. Regression analysis provides an equation for a graph that is used to predict the data. It always shows the weighted average value for prediction, and through statistical analysis it can predict the output accurately. Most elementary statistics courses cover the fundamental techniques, such as making scatter plots and performing linear regression. The most suitable algorithm can be found based on the overall accuracy rate, prediction speed, and training time.
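The way K-NN memorizes the training set and the way an accuracy rate is scored can be illustrated with a small sketch. This is a plain Python illustration, not the MATLAB classification learner app used in the study; the Euclidean distance, the majority vote, the value of k, and the two-feature toy data are assumptions made for the example.

```python
import math
from collections import Counter


def knn_predict(train_x, train_y, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    neighbours = sorted(
        (math.dist(x, query), label) for x, label in zip(train_x, train_y)
    )
    votes = Counter(label for _, label in neighbours[:k])
    return votes.most_common(1)[0][0]


def accuracy_rate(y_true, y_pred):
    """Percentage of samples diagnosed properly out of all samples diagnosed."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return 100.0 * correct / len(y_true)


# Hypothetical two-feature training set with two conditions.
train_x = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (5.0, 5.0), (5.0, 6.0), (6.0, 5.0)]
train_y = ["healthy", "healthy", "healthy", "fault", "fault", "fault"]

test_x = [(0.2, 0.1), (5.5, 5.2)]
test_y = ["healthy", "fault"]
predicted = [knn_predict(train_x, train_y, q) for q in test_x]
score = accuracy_rate(test_y, predicted)
```

Because every prediction scans the stored training points, the memory and prediction-time costs noted above grow with the size of the training set.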
For the experiment, the data are randomly divided into two sets in the ratio 70:30. In most cases, 70% of the data are used for training and 30% for testing and evaluation. The same rule applies to all the algorithms. The sample data size is 5000, as for ANN and ANFIS. For feature extraction, 300 data samples have been formed for better analysis. Of these, 70% are used for training, 15% for cross-validation, and the remaining 15% for testing. The entire diagnosis is carried out with the MATLAB pattern recognition and Classification Learner apps. Based on the evaluation, the accuracy rate of each algorithm is obtained using Eq. (36). With the help of the Classification Learner app, each algorithm's accuracy rate, prediction speed, and training time have been analyzed and compared.
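The 70/15/15 split of the 300 samples and the accuracy-rate calculation described above can be sketched as follows. This is an illustrative plain-Python sketch, not the MATLAB procedure used in the study: the helper names `split_70_15_15` and `accuracy_rate`, the fixed random seed, and the example labels are assumptions.

```python
import random


def split_70_15_15(samples, seed=0):
    """Shuffle the samples and split them 70% / 15% / 15%."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    # Integer arithmetic avoids floating-point rounding in the set sizes.
    n_train = len(shuffled) * 70 // 100
    n_val = len(shuffled) * 15 // 100
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]
    return train, val, test


def accuracy_rate(predicted, actual):
    """Percentage of samples diagnosed properly out of all samples diagnosed."""
    correct = sum(p == a for p, a in zip(predicted, actual))
    return 100 * correct / len(actual)


train, val, test = split_70_15_15(range(300))
print(len(train), len(val), len(test))  # 210 45 45

# Hypothetical predictions: three of the four labels match.
predicted = ["fault", "healthy", "fault", "fault"]
actual = ["fault", "healthy", "healthy", "fault"]
print(accuracy_rate(predicted, actual))  # 75.0
```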
The accuracy rate is computed as
$$\text{Accuracy rate}\,(\%) = \frac{\text{Number of data diagnosed properly}}{\text{Total number of data diagnosed}} \times 100 \tag{36}$$
Table 10 shows that K-NN and ANN perform better than the other algorithms for this research. However, based on accuracy rate, prediction speed, and training time, K-NN is more suitable than ANN. Figure 13 depicts the overall accuracy of the ML algorithms.

Conclusion
This article explains the inter-turn fault analysis of an induction motor-based pumping system and shows the parameter changes during fault situations under different turn conditions. The simulation results have been verified with the HIL-based OP5700 device, and the motor's phase current increases when the fault occurs. Once the current increases, speed and torque also increase, affecting the pumping system. The higher speed suddenly increases the flow rate of the pump, which causes a large pressure drop and a decrease in head value. If the pressure drops drastically, cavitation occurs, and the sudden increase in flow rate causes severe vibration in the pipe, which leads to water hammer. In this research, fault identification and prediction were first carried out with ANN and ANFIS algorithm-based models. Both techniques were used, and ANN performs better than ANFIS based on RMSE and R² values. Various other research works are also compared with the proposed work to identify its new developments. It is observed that the proposed research is suitable for industrial application and can easily identify the faulty condition for a large amount of data. In the future, the ANN could be used for other fault detection tasks in motor and pumping systems and in other machinery, and could become a comprehensive diagnosis technique. The authors also compared various ML algorithms with ANN and ANFIS. Based on accuracy rate, prediction speed, and training time, K-NN and ANN work better for the proposed research, and based on overall accuracy rate K-NN works better than ANN. In addition, the deployment of the developed technique in a laboratory environment is an extension of the present work.
Further research is possible using the HIL-based OP5700 device to verify the simulation results.

Data availability
All data generated or analysed during this study are included in this published article and its supplementary information files. The supplementary file contains all the data in tabular form. Further requests for data from this study should be addressed to the corresponding author or the first author.

Funding
There is no source of funding for this research activity.